16 research outputs found

    An efficient implementation of lattice-ladder multilayer perceptrons in field programmable gate arrays

    Implementation efficiency in electronic systems is a trade-off between conflicting requirements: growing computational load and faster data exchange on the one hand, and rising energy consumption on the other. This forces researchers not only to optimize the algorithm but also to implement it quickly in specialized hardware. This work therefore tackles the problem of efficient and straightforward implementation of real-time electronic intelligent systems on a field-programmable gate array (FPGA). The object of research is specialized FPGA intellectual property (IP) cores that operate in real time. The thesis investigates the following main aspects of the research object: implementation criteria and techniques. The aim of the thesis is to optimize the FPGA implementation process for a selected class of dynamic artificial neural networks. To solve the stated problem and reach this aim, the following main tasks are formulated: justify the selection of the Lattice-Ladder Multi-Layer Perceptron (LLMLP) class and of its electronic intelligent system test bed, a speaker-dependent Lithuanian speech recognizer, to be created and investigated; develop a dedicated technique for implementing the LLMLP class on FPGA, based on specialized efficiency criteria for circuit synthesis; develop optimized FPGA IP cores for the Lithuanian speech recognizer and experimentally confirm their efficiency. The dissertation contains an introduction, four chapters and general conclusions. The first chapter reviews the fundamentals of computer-aided design, artificial neural networks and speech recognition implementation on FPGA. The second chapter proposes efficiency criteria and a technique for implementing LLMLP IP cores, aimed at multi-objective optimization of throughput, LLMLP complexity and resource utilization. Data flow graphs are applied to optimize the LLMLP computations, and an optimized neuron processing element is proposed. The third chapter develops and analyzes IP cores for feature extraction and comparison in the Lithuanian speech recognizer. The fourth chapter is devoted to experimental verification of the numerous LLMLP IP cores developed. The experiments measured isolated word recognition accuracy and speed for different speakers, signal-to-noise ratios, feature extraction methods and accelerated comparison methods. The main results of the thesis were published in 12 scientific publications: eight in peer-reviewed scientific journals, four of them indexed in the Thomson Reuters Web of Science database, and four in conference proceedings. The results were presented at 17 scientific conferences

    Modified SURF algorithm implementation on FPGA for real-time object tracking

    The paper describes an FPGA-based implementation of a modified speeded-up robust features (SURF) algorithm. An FPGA was selected so that parallel processes could be implemented in VHDL, ensuring feature extraction in real time. A sliding window of 84×84 pixels stores the integral image pixels and accelerates Hessian determinant calculation, orientation assignment and descriptor estimation. A local extremum search is used to find points of interest in 8 scales. The simplified descriptor and the orientation vector are calculated in parallel in 6 scales. The algorithm was investigated by tracking a four-point marker and drawing a plane or a cube based on it. All parts of the algorithm run on a 25 MHz clock. The video stream comes from a 60 fps camera with a resolution of 640×480 pixels. Article in Lithuanian (original title: Modifikuoto požymių vaizde išskyrimo SURF algoritmo objektui sekti realiuoju laiku įgyvendinimas lauku programuojamoje loginėje matricoje). Keywords: field-programmable gate array, feature extraction, image filtering, object tracking
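The role of the integral image in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's VHDL design; the function names `integral_image` and `box_sum` are hypothetical, and the random 84×84 window only stands in for real pixel data. The point is that once the integral image is stored, the sum over any rectangular box filter costs four lookups regardless of box size.

```python
import numpy as np

def integral_image(img):
    """Return a zero-padded integral image: ii[r, c] == img[:r, :c].sum()."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a box, using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return int(ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left])

# Stand-in for one 84x84 sliding window of 8-bit pixels (random data, assumption).
rng = np.random.default_rng(0)
window = rng.integers(0, 256, size=(84, 84))
ii = integral_image(window)

# A 9x15 box anchored at row 10, column 20.
print(box_sum(ii, 10, 20, 9, 15), int(window[10:19, 20:35].sum()))
```

As a side note, a 640×480 stream at 60 fps amounts to 18,432,000 pixels per second, which is below the 25 MHz clock; the abstract does not say how many pixels are processed per clock cycle.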

    Labeled dataset for bee detection and direction estimation on entrance to beehive

    The datasets for bee detection, pose estimation and segmentation consist of organized folders containing both images and corresponding labels. The detection dataset comprises a total of 7200 individual frames collected at 8 different beehives. The pose dataset contains 400 images of bees annotated with two key points per bee: the first point marks the head and the second marks the stinger. All frames have a resolution of 1920×1080 pixels. The segmentation dataset contains 2300 cropped images of bees. These cropped images are annotated with triangular markers that aid in estimating directional vectors. The labels in all proposed datasets were saved in YOLO format. The labeling process was automated by training a YOLOv8 model on a set of manually annotated images for bee detection. After detection, all the labels were visually reviewed and corrected. Frames were captured using a stationary camera mounted 30 cm above the beehive landing boards. The data were collected from June to July 2023 in the Vilnius district
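A direction estimate can be derived from the two key points described above. The sketch below is only an illustration: the function name `bee_direction` is hypothetical, and it assumes the key points are stored as coordinates normalized to the range 0 to 1, as is usual for YOLO-format labels. The exact column layout of the dataset's label files is not given here, so the sketch takes the two points directly.

```python
import math

def bee_direction(head, stinger, width=1920, height=1080):
    """Return the stinger-to-head vector in pixels and its angle in degrees.

    head, stinger: (x, y) pairs normalized to [0, 1] (assumed label convention).
    The angle is measured counter-clockwise from the image's +x axis.
    """
    hx, hy = head[0] * width, head[1] * height
    sx, sy = stinger[0] * width, stinger[1] * height
    dx, dy = hx - sx, hy - sy
    # Image y grows downward, so negate dy to get a conventional angle.
    angle = math.degrees(math.atan2(-dy, dx)) % 360.0
    return (dx, dy), angle

# A bee whose head is directly above its stinger in the image points "up".
vector, angle = bee_direction(head=(0.5, 0.4), stinger=(0.5, 0.5))
print(vector, angle)
```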

    Investigation of the electromagnetic fields in gyrotropic waveguides

    This paper presents a general algorithm, a program written in the MATLAB® environment and a technique for investigating electromagnetic (EM) fields in gyrotropic waveguides. To test them, the EM field distributions of the H01 and HE11 modes were investigated in dielectric waveguides. The EM field distributions of the HE11 mode in gyroelectric semiconductor and semiconductor-dielectric n-InSb waveguides are investigated. The dependence of the EM field distribution of the main hybrid mode HE11 in semiconductor n-InSb waveguides on the normalized frequency and on the external dielectric layer is determined

    mNet2FPGA: A Design Flow for Mapping a Fixed-Point CNN to Zynq SoC FPGA

    Convolutional neural networks (CNNs) are a computation- and memory-demanding class of deep neural networks. Because of the high computational complexity of CNNs, field-programmable gate arrays (FPGAs) are often used to accelerate networks deployed on embedded platforms. In most cases, CNNs are trained with existing deep learning frameworks and then mapped to FPGAs with specialized toolflows. In this paper, we propose a CNN core architecture called mNet2FPGA that places a trained CNN on a SoC FPGA. The processing system (PS) configures the convolution and fully connected cores according to a list of prescheduled instructions. The programmable logic holds the cores of the convolution and fully connected layers. The hardware architecture is based on advanced extensible interface (AXI) stream processing with simultaneous bidirectional transfers between RAM and the CNN core. The core was tested on a cost-optimized Z-7020 FPGA with 16-bit fixed-point VGG networks. Kernel binarization and merging with the batch normalization layer were applied to reduce the number of DSPs in the multi-channel convolutional core. The convolutional core processes eight input feature maps at once and generates eight output channels of the same size and composition at 50 MHz. The core of the fully connected (FC) layer works at 100 MHz with up to 4096 neurons per layer. In the current version of the CNN core, the size of the convolutional kernel is fixed to 3×3. The estimated average performance is 8.6 GOPS for VGG13 and nearly 8.4 GOPS for VGG16/19 networks. This article belongs to the Section Artificial Intelligence Circuits and Systems (AICAS). This research is supported by the Central Project Management Agency (Vilnius, Lithuania), project number 01.2.2-CPVA-K-703-02-0017
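Merging a batch normalization layer into the preceding convolution is a standard inference-time transformation, and a minimal sketch of it may help. The code below shows the textbook folding for real-valued weights followed by a 16-bit quantization step. It is not the mNet2FPGA toolflow: the function names are hypothetical, the split into 8 fractional bits is an assumption, and how the paper combines folding with kernel binarization is not described in the abstract.

```python
import numpy as np

def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters into conv weights of shape (out_ch, in_ch, kh, kw)."""
    scale = gamma / np.sqrt(var + eps)            # one scale per output channel
    folded_w = weights * scale[:, None, None, None]
    folded_b = (bias - mean) * scale + beta
    return folded_w, folded_b

def to_fixed_point(x, frac_bits=8):
    """Quantize to signed 16-bit fixed point (frac_bits is an assumed format)."""
    q = np.round(np.asarray(x, dtype=np.float64) * (1 << frac_bits))
    return np.clip(q, -32768, 32767).astype(np.int16)

# Check on one 3x3 patch with 8 input and 8 output channels (random data).
rng = np.random.default_rng(1)
w, b = rng.normal(size=(8, 8, 3, 3)), rng.normal(size=8)
gamma, beta = rng.normal(size=8), rng.normal(size=8)
mean, var = rng.normal(size=8), rng.uniform(0.5, 2.0, size=8)
patch = rng.normal(size=(8, 3, 3))

conv_out = (w * patch).sum(axis=(1, 2, 3)) + b
reference = gamma * (conv_out - mean) / np.sqrt(var + 1e-5) + beta
fw, fb = fold_batchnorm(w, b, gamma, beta, mean, var)
folded_out = (fw * patch).sum(axis=(1, 2, 3)) + fb
print(np.allclose(folded_out, reference))
```

After folding, the batch normalization layer needs no separate multipliers at inference time, which is one plausible reason such merging reduces DSP usage.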

    Energy detector implementation in FPGA for estimation of word boundaries

    This paper describes the implementation of a word boundary estimation module in an FPGA. The module is based on an energy detector, chosen because this method can be implemented efficiently with digital signal processing resources, and it is optimized for FPGA implementation. It occupies 54 logic slices, which is only 0.7% of the resources of a Spartan-6 LX45 device. Experiments with the module were performed at different signal-to-noise (S/N) ratios. At S/N ratios of 20 dB and 15 dB, word boundaries were estimated with 100% accuracy. Acceptable results were also achieved at S/N ratios of 10 dB and 5 dB, where the estimation accuracy was 95% and 93%, respectively. Article in Lithuanian (original title: Energijos detektoriaus, naudojamo žodžio riboms nustatyti, įgyvendinimas lauku programuojama logine matrica). Keywords: field-programmable gate array, word boundary estimation, energy detector, silent interval, signal-to-noise ratio
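The idea of an energy detector for word boundaries can be sketched in a few lines of Python. This is a software illustration under stated assumptions, not the paper's FPGA module: the function name `word_boundaries`, the frame length of 256 samples and the threshold of 10% of the peak frame energy are all hypothetical choices.

```python
import numpy as np

def word_boundaries(signal, frame_len=256, threshold_ratio=0.1):
    """Return (start, end) sample indices of the active region, or None.

    A frame counts as speech when its energy exceeds a fraction of the
    largest frame energy (threshold rule is an assumption of this sketch).
    """
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[:n_frames * frame_len], dtype=np.float64)
    energy = (frames.reshape(n_frames, frame_len) ** 2).sum(axis=1)
    active = np.flatnonzero(energy > threshold_ratio * energy.max())
    if active.size == 0:
        return None
    return int(active[0]) * frame_len, (int(active[-1]) + 1) * frame_len

# Synthetic test signal: low-level noise with a tone burst standing in for a word.
rng = np.random.default_rng(0)
sig = 0.01 * rng.standard_normal(4096)
sig[1024:2048] += np.sin(2 * np.pi * 0.05 * np.arange(1024))
print(word_boundaries(sig))
```

A fixed fraction of the peak energy is only one possible threshold rule; a detector that tracks the noise floor during silent intervals would be another.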

    FPGA-based implementation of Lithuanian isolated word recognition algorithm

    The paper describes an FPGA-based implementation of a Lithuanian isolated word recognition algorithm. An FPGA was selected so that parallel processes could be implemented in VHDL, ensuring fast signal processing at a low clock rate (up to 50 MHz). Cepstrum analysis was applied to extract speech features. The dynamic time warping (DTW) algorithm was used to compare the vectors of cepstrum coefficients. A feature library for 100 words was created and stored in the internal FPGA BRAM memory. Experimental testing with speaker-dependent records demonstrated a recognition rate of 94%. A recognition rate of 58% was achieved for speaker-independent records. Calculating the cepstrum coefficients for one word took 8.52 ms at a 50 MHz clock, while 100 DTWs took 66.56 ms at a 25 MHz clock. Article in Lithuanian (original title: Lietuvių kalbos pavienių žodžių atpažinimo algoritmo įgyvendinimas lauku programuojama logine matrica). Keywords: field-programmable gate array, word recognition, cepstrum, dynamic time warping
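The comparison step described above can be illustrated with the classic dynamic time warping recurrence. The sketch below is a plain software version, not the paper's parallel VHDL design. The names `dtw_distance` and `recognize` are hypothetical, and the Euclidean local distance and the unconstrained warping path are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors (T x D arrays)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local distance (assumed Euclidean)
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def recognize(features, library):
    """Return the library key whose template has the smallest DTW cost."""
    return min(library, key=lambda word: dtw_distance(features, library[word]))

# Toy library with one-dimensional "features" (placeholder data).
library = {
    "word_a": [[0.0], [1.0], [2.0]],
    "word_b": [[2.0], [1.0], [0.0]],
}
spoken = [[0.0], [1.0], [1.0], [2.0]]   # a time-stretched version of word_a
print(recognize(spoken, library))
```

For scale, 66.56 ms at 25 MHz corresponds to 1,664,000 clock cycles, or 16,640 cycles per comparison if the 100 DTWs run one after another; the abstract does not state how the comparisons are scheduled.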

    FPGA implementation of range addressable activation function for lattice-ladder neuron

    FPGA implementation of the hyperbolic tangent activation function for a multilayer perceptron structure seems attractive; however, preliminary results on the choice of memory size are lacking, particularly when the look-up table (LUT) of the function is stored in dedicated on-chip block RAM. The aim of this investigation was to gain insight into the distortions of the selected neuron model output by evaluating the RMS error of the transfer function and the mean and maximum errors of the neuron output signal while changing the gain and memory size of the activation function. To this end, a range addressable activation function for the second-order normalized lattice-ladder neuron was implemented in an Artix-7 FPGA. Various gain and memory constraints were investigated. Increasing the LUT memory size and the gain yielded smaller error of the output signal and nonlinear influence on the transfer function. 2 kB of BRAM is sufficient to achieve a tolerable maximum error of less than 0.4% while utilizing only 0.36% of the total on-chip block memory
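The trade-off between table size and approximation error can be explored with a small Python sketch. Note that it swaps in a simpler technique: a uniformly addressed look-up table rather than the range addressable scheme of the paper. The function names, the input range limit `x_max = 4.0` and the use of floating-point table entries are all assumptions, so the numbers it produces are not comparable with the paper's results.

```python
import numpy as np

def build_tanh_lut(n_entries, x_max=4.0):
    """Table of tanh sampled at bin centers on [0, x_max); odd symmetry covers x < 0."""
    centers = (np.arange(n_entries) + 0.5) * (x_max / n_entries)
    return np.tanh(centers)

def lut_tanh(x, lut, x_max=4.0):
    """Approximate tanh(x) by table lookup; inputs beyond x_max saturate."""
    x = np.asarray(x, dtype=np.float64)
    idx = (np.abs(x) / x_max * len(lut)).astype(int)
    idx = np.minimum(idx, len(lut) - 1)
    return np.sign(x) * lut[idx]

def max_error(n_entries, x_max=4.0):
    """Largest absolute deviation from tanh on a dense grid over [-x_max, x_max]."""
    x = np.linspace(-x_max, x_max, 20001)
    return float(np.max(np.abs(lut_tanh(x, build_tanh_lut(n_entries, x_max), x_max) - np.tanh(x))))

for size in (64, 256, 1024):
    print(size, max_error(size))
```

In this uniform scheme the worst-case error shrinks roughly in proportion to the bin width, since the slope of tanh never exceeds 1.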

    FPGA Implementation of a Convolutional Neural Network and Its Application for Pollen Detection upon Entrance to the Beehive

    The condition of a bee colony can be predicted by monitoring bees upon hive entrance. The presence of pollen grains gives beekeepers significant information about the well-being of the bee colony in a non-invasive way. This paper presents a field-programmable-gate-array (FPGA)-based pollen detector from images obtained at the hive entrance. The image dataset was acquired at native entrance ramps from six different hives. To evaluate and demonstrate the performance of the system, various densities of convolutional neural networks (CNNs) were trained and tested to find those suitable for pollen grain detection at the chosen image resolution. We propose a new CNN accelerator architecture that places a pre-trained CNN on an SoC FPGA. The CNN accelerator was implemented on a cost-optimized Z-7020 FPGA with 16-bit fixed-point operations. The kernel binarization and merging with the batch normalization layer were applied to reduce the number of DSPs in the multi-channel convolutional core. The estimated average performance was 32 GOPS for a single convolutional core. We found that the CNN with four convolutional and two dense layers gave a 92% classification accuracy, and it matched those declared for state-of-the-art methods. It took 8.8 ms to classify a 512 × 128 px frame and 2.4 ms for a 256 × 64 px frame. The frame rate of the proposed method outperformed the speed of known pollen detectors. The developed pollen detector is cost effective and can be used as a real-time image classification module for hive status monitoring

    Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal

    This paper discusses an algorithm that attempts to automatically calculate the effect of room reverberation by training a mathematical model based on a recurrent neural network on anechoic and reverberant sound samples. Modelling the room impulse response (RIR) recorded at a 44.1 kHz sampling rate using a system identification-based approach in the time domain, even with deep learning models, is prohibitively complex, and it is almost impossible to automatically learn the parameters of the model for a reverberation time longer than 1 s. Therefore, this paper presents a method to model a reverberated audio signal in the frequency domain. To reduce complexity, the spectrum is analyzed on a logarithmic scale, based on the subjective characteristics of human hearing, by calculating 10 octaves in the range 20–20,000 Hz and dividing each octave into bands 1/3 or 1/12 of an octave wide. This maintains equal resolution at high, mid, and low frequencies. The study examines three different recurrent network structures, LSTM, BiLSTM, and GRU, and compares different sizes of the two hidden layers. The experimental study compared the modelling when each octave of the spectrum is divided into a different number of bands and assessed the feasibility of using a single model to predict the spectrum of a reverberated audio signal in adjacent frequency bands. The paper also presents and describes in detail a new RIR dataset that, although synthetic, is calibrated with recorded impulses
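The logarithmic band layout mentioned above can be made concrete with a short Python sketch. The function name `fractional_octave_edges` is hypothetical, and placing the first band edge exactly at 20 Hz is an assumption. Ten full octaves above 20 Hz end at 20,480 Hz, slightly above the 20,000 Hz quoted in the abstract; how the paper treats the top edge is not stated.

```python
def fractional_octave_edges(f_low=20.0, n_octaves=10, bands_per_octave=3):
    """Band edges in Hz for a grid with a constant number of bands per octave."""
    n_bands = n_octaves * bands_per_octave
    return [f_low * 2.0 ** (k / bands_per_octave) for k in range(n_bands + 1)]

third = fractional_octave_edges(bands_per_octave=3)     # 30 bands
twelfth = fractional_octave_edges(bands_per_octave=12)  # 120 bands

print(len(third) - 1, len(twelfth) - 1)
print([round(f, 1) for f in third[:4]])
```

Every band spans the same frequency ratio, so each octave gets the same number of bands whether it lies at the low or the high end of the spectrum.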